Session C-7

Containers and Data Centers

Conference
10:00 AM — 11:30 AM EDT
Local
May 13 Thu, 7:00 AM — 8:30 AM PDT

Exploring Layered Container Structure for Cost Efficient Microservice Deployment

Lin Gu (Huazhong University of Science and Technology, China); Deze Zeng (China University of Geosciences, China); Jie Hu and Hai Jin (Huazhong University of Science and Technology, China); Song Guo (Hong Kong Polytechnic University, Hong Kong); Albert Zomaya (The University of Sydney, Australia)

Containers, as a lightweight virtualization technology that enables continuous integration and easy deployment, have been widely adopted to support diverse microservices. At runtime, non-local container images must frequently be pulled from remote registries to local servers, resulting in heavy pulling traffic and hence long startup times. A distinctive, and so far unexploited, feature of container-based microservices is that container images have a layered structure, so common base layers can be shared between co-located microservices.

In this paper, we propose a layer-sharing microservice deployment and image pulling strategy that exploits layer sharing to speed up microservice startup and lower image storage consumption. The problem is formulated as an Integer Linear Program (ILP), and we propose an Accelerated Distributed Augmented Lagrangian (ADAL) based distributed algorithm executed cooperatively by registries and servers. Through extensive trace-driven experiments, we validate the high efficiency of our ADAL-based algorithm: it accelerates microservice startup by 2.30 times on average and reduces storage consumption by 55.33%.
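
For intuition only, the following toy Python sketch (hypothetical images and layer sizes, not the paper's ILP or ADAL algorithm) shows how co-locating microservices that share base layers reduces the image data a server must pull:

```python
# Toy illustration of layer sharing (hypothetical images and sizes, not the
# paper's model). Each image is a set of (layer_id, size_MB) pairs; a server
# pulls each distinct layer it needs exactly once.
images = {
    "svc_a": {("os_base", 80), ("python_rt", 120), ("app_a", 30)},
    "svc_b": {("os_base", 80), ("python_rt", 120), ("app_b", 25)},
    "svc_c": {("os_base", 80), ("java_rt", 150), ("app_c", 40)},
}

def pull_traffic(placement):
    """Total MB pulled when every server fetches its distinct layers once."""
    total = 0
    for server, services in placement.items():
        layers = set().union(*(images[s] for s in services))
        total += sum(size for _, size in layers)
    return total

# Layer-oblivious placement separates the two Python-based services.
oblivious = {"server1": ["svc_a", "svc_c"], "server2": ["svc_b"]}
# Layer-sharing-aware placement co-locates services with common layers.
sharing = {"server1": ["svc_a", "svc_b"], "server2": ["svc_c"]}

print(pull_traffic(oblivious))  # 645 MB
print(pull_traffic(sharing))    # 525 MB: python_rt is pulled once, not twice
```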

NetMARKS: Network Metrics-AwaRe Kubernetes Scheduler Powered by Service Mesh

Ɓukasz Wojciechowski (Samsung R&D Institute Poland, Poland); Krzysztof Opasiak and Jakub Latusek (Warsaw University of Technology & Samsung R&D Institute Poland, Poland); Maciej Wereski (Samsung R&D Institute Poland, Poland); Victor Morales (Samsung Research America, USA); Taewan Kim (Samsung Research, Samsung Electronics Co., Ltd., Korea (South)); Moonki Hong (Samsung Electronics, Co., Ltd., Korea (South))

Container technology has revolutionized the way software is packaged and run. The telecommunications industry, now facing the 5G transformation, views containers as the best way to achieve an agile infrastructure that can serve as a stable base for high-throughput, low-latency 5G edge applications. These challenges make optimal scheduling of performance-sensitive containerized workloads a matter of emerging importance. Meanwhile, the wide adoption of Kubernetes across industries has made it the de facto standard for container orchestration. Several attempts have been made to improve Kubernetes scheduling, but existing solutions either do not respect current scheduling rules or consider only a static view of the infrastructure. To address this, we propose NetMARKS, a novel approach to Kubernetes pod scheduling that uses dynamic network metrics collected with the Istio service mesh. This solution improves Kubernetes scheduling while remaining fully backward compatible. We validated our solution using different workloads and processing layouts. Based on our analysis, NetMARKS can reduce application response time by up to 37% and save up to 50% of inter-node bandwidth in a fully automated manner. This significant improvement is crucial to Kubernetes adoption in 5G use cases, especially for multi-access edge computing and machine-to-machine communication.
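
As a rough illustration (not the actual NetMARKS scheduler extension, which plugs into the Kubernetes scheduling framework and reads Istio telemetry), the sketch below scores candidate nodes by how much of a pending pod's service-mesh traffic would stay node-local:

```python
# Illustrative scoring only; all service names and traffic figures are made up.
# Hypothetical bytes/s exchanged between services, as a service mesh reports.
traffic = {("frontend", "cart"): 4_000_000,
           ("frontend", "catalog"): 1_500_000,
           ("cart", "redis"): 6_000_000}

def rate(a, b):
    return traffic.get((a, b), 0) + traffic.get((b, a), 0)

def score_node(pod_service, node_pods):
    """Higher score = more of the pod's traffic stays node-local."""
    return sum(rate(pod_service, peer) for peer in node_pods)

nodes = {"node1": ["frontend"], "node2": ["redis", "catalog"]}
scores = {n: score_node("cart", pods) for n, pods in nodes.items()}
print(scores)                        # {'node1': 4000000, 'node2': 6000000}
print(max(scores, key=scores.get))   # 'node2': co-locate 'cart' with 'redis'
```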

Optimal Rack-Coordinated Updates in Erasure-Coded Data Centers

Guowen Gong, Zhirong Shen and Suzhen Wu (Xiamen University, China); Xiaolu Li and Patrick Pak-Ching Lee (The Chinese University of Hong Kong, Hong Kong)

Erasure coding has been extensively deployed in today's data centers to tolerate prevalent failures, yet it tends to generate substantial cross-rack traffic for parity updates. In this paper, we propose a new rack-coordinated update mechanism to suppress cross-rack update traffic. It comprises two successive phases: a delta-collecting phase that collects data delta chunks, and a selective parity update phase that renews parity chunks based on the update pattern and the parity layout. We further design RackCU, an optimal rack-coordinated update solution that achieves the theoretical lower bound of cross-rack update traffic. We finally conduct extensive evaluations, through large-scale simulation and real-world data center experiments, showing that RackCU can reduce cross-rack update traffic by 22.1%-75.1% and hence improve update throughput by 34.2%-292.6%.
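
A back-of-the-envelope sketch (hypothetical parameters; RackCU's actual scheme also exploits the update pattern and parity layout) of why rack coordination cuts cross-rack traffic: since each parity delta is a linear combination of data deltas, deltas from updated chunks in the same rack can be aggregated before crossing racks.

```python
# Toy count of chunk transfers that cross rack boundaries (made-up parameters).
u, m = 4, 2   # u updated data chunks in one rack; m parity chunks, each in a
              # distinct remote rack

# Baseline delta update: every updated chunk ships its delta to every parity rack.
baseline = u * m

# Rack-coordinated update: deltas are first combined inside the data rack and
# only one aggregated delta is sent per parity rack.
rack_coordinated = m

print(baseline, rack_coordinated)   # 8 vs. 2 cross-rack transfers
```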

Primus: Fast and Robust Centralized Routing for Large-scale Data Center Networks

Guihua Zhou, Guo Chen, Fusheng Lin, Tingting Xu, Dehui Wei and Jianbing Wu (Hunan University, China); Li Chen (Huawei, Hong Kong); Yuanwei Lu and Andrew Qu (Tencent, China); Hua Shao (Tsinghua University & Tencent, China); Hongbo Jiang (Hunan University, China)

This paper presents Primus, a fast and robust centralized routing solution for data center networks (DCNs). For fast routing calculation, Primus uses a centralized controller to collect and disseminate the network's link states (LS), and offloads the actual route calculation onto each switch. Observing that routing changes in DCNs, which have regular topologies, fall into a few fixed patterns, we simplify each switch's routing calculation into a table-lookup procedure: comparing LS changes with a pre-installed base topology and updating routing paths according to predefined rules. As a result, the routing calculation at each switch takes only tens of microseconds, even in a large topology containing more than 10K switches. For efficient controller fault tolerance, Primus purposely uses a reporter switch to ensure that LS updates are successfully delivered to all affected switches. Primus can therefore use multiple stateless controllers and little redundant traffic to tolerate failures, incurring little overhead in the normal case while keeping routing reaction times within tens of milliseconds even under complex data-plane and control-plane failures. We design, implement and evaluate Primus with extensive experiments on Linux-machine controllers and white-box switches. Primus converges ∌1200x faster than the distributed protocol BGP and ∌100x faster than the state-of-the-art centralized routing solution.
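
A minimal sketch of the table-lookup idea (hypothetical tables; Primus operates on pre-installed base-topology state in switch hardware): on a link-state change, a switch rewrites only the affected entries according to a predefined rule instead of re-running a full route computation.

```python
# Hypothetical per-destination next-hop table with a precomputed backup.
routes = {
    "pod2": {"primary": "spine1", "backup": "spine2"},
    "pod3": {"primary": "spine1", "backup": "spine2"},
    "pod4": {"primary": "spine2", "backup": "spine1"},
}

def on_link_down(failed_next_hop):
    """Predefined repair rule: swap affected entries to their backup next hop
    instead of recomputing shortest paths from scratch."""
    for entry in routes.values():
        if entry["primary"] == failed_next_hop:
            entry["primary"], entry["backup"] = entry["backup"], entry["primary"]

on_link_down("spine1")
print({dst: e["primary"] for dst, e in routes.items()})
# {'pod2': 'spine2', 'pod3': 'spine2', 'pod4': 'spine2'}
```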

Session Chair

Wei Wang (Hong Kong University of Science and Technology)

Session C-8

Sea, Space and Quantum Networks

Conference
12:00 PM — 1:30 PM EDT
Local
May 13 Thu, 9:00 AM — 10:30 AM PDT

PolarTracker: Attitude-aware Channel Access for Floating Low Power Wide Area Networks

Yuting Wang, Xiaolong Zheng, Liang Liu and Huadong Ma (Beijing University of Posts and Telecommunications, China)

Low Power Wide Area Networks (LPWAN) such as Long Range (LoRa) show great potential in emerging aquatic IoT applications. However, our deployment experience shows that floating LPWAN deployments suffer significant performance degradation compared to static terrestrial deployments. Our measurement results reveal that the cause is the polarization and directivity of the antenna: the dynamic attitude of a floating node incurs varying signal strength losses, which are ignored by the attitude-oblivious link model adopted in most existing methods. When a node accesses the channel at a misaligned attitude, packet errors can occur. In this paper, we propose an attitude-aware link model that explicitly quantifies the impact of node attitude on link quality. Based on the new model, we propose PolarTracker, a novel channel access method for floating LPWAN. PolarTracker tracks the node's attitude alignment state and schedules transmissions into the aligned periods with better link quality. We implement a prototype of PolarTracker on commercial LoRa platforms and extensively evaluate its performance in various real-world environments. The experimental results show that PolarTracker improves the packet reception ratio by 48.8% compared with ALOHA in LoRaWAN.
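
For intuition, a toy model (illustrative only; PolarTracker tracks the real antenna attitude and schedules LoRa transmissions accordingly) of how polarization misalignment between two linearly polarized antennas eats into the link budget:

```python
import math

# Toy model of attitude-dependent polarization loss between two linearly
# polarized antennas; the 3 dB budget below is an assumed value.

def polarization_loss_db(misalignment_deg):
    """Mismatch loss, roughly -10*log10(cos^2(theta)) for angle theta."""
    theta = math.radians(misalignment_deg)
    factor = max(math.cos(theta) ** 2, 1e-6)   # avoid log(0) near 90 degrees
    return -10 * math.log10(factor)

def should_transmit(misalignment_deg, budget_db=3.0):
    """Access the channel only while the extra loss fits the link budget."""
    return polarization_loss_db(misalignment_deg) <= budget_db

for angle in (0, 30, 45, 60, 80):
    print(angle, round(polarization_loss_db(angle), 1), should_transmit(angle))
# 0 -> 0.0 dB (transmit), 45 -> ~3.0 dB (hold), 80 -> ~15.2 dB (hold)
```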

Mobility- and Load-Adaptive Controller Placement and Assignment in LEO Satellite Networks

Long Chen, Feilong Tang and Xu Li (Shanghai Jiao Tong University, China)

Software-defined networking (SDN) based LEO satellite networks can make full use of satellite resources through flexible function configuration and efficient resource management by controllers. Consequently, controllers have to be carefully deployed based on the dynamic topology and time-varying workload. However, existing work on controller placement and assignment is not applicable to LEO satellite networks with highly dynamic topology and randomly fluctuating load. In this paper, we first formulate the adaptive controller placement and assignment (ACPA) problem and prove its NP-hardness. Then, we propose the control relation graph (CRG) to quantitatively capture the control overhead in LEO satellite networks. Next, we propose the CRG-based controller placement and assignment (CCPA) algorithm with a bounded approximation ratio. Finally, using the predicted topology and estimated traffic load, a lookahead-based improvement algorithm is designed to further decrease the overall management cost. Extensive emulation results demonstrate that the CCPA algorithm outperforms related schemes in terms of response time and load balancing.
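
As a simplified sketch (hypothetical costs and capacities; the paper's CCPA algorithm works on the control relation graph and carries an approximation guarantee), a greedy assignment of satellites to already-placed controllers by control cost, subject to controller capacity, could look like this:

```python
# Hypothetical per-(satellite, controller) control costs, e.g. derived from
# propagation delay and expected request load; capacities are also made up.
cost = {
    ("sat1", "ctrlA"): 3, ("sat1", "ctrlB"): 7,
    ("sat2", "ctrlA"): 4, ("sat2", "ctrlB"): 2,
    ("sat3", "ctrlA"): 6, ("sat3", "ctrlB"): 5,
}
capacity = {"ctrlA": 2, "ctrlB": 1}   # max satellites per controller

def greedy_assign():
    """Assign satellites to controllers, cheapest pairs first, while
    respecting controller capacity."""
    load = {c: 0 for c in capacity}
    assignment = {}
    for (sat, ctrl), _ in sorted(cost.items(), key=lambda kv: kv[1]):
        if sat not in assignment and load[ctrl] < capacity[ctrl]:
            assignment[sat] = ctrl
            load[ctrl] += 1
    return assignment

print(greedy_assign())   # {'sat2': 'ctrlB', 'sat1': 'ctrlA', 'sat3': 'ctrlA'}
```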

Time-Varying Resource Graph Based Resource Model for Space-Terrestrial Integrated Networks

Long Chen and Feilong Tang (Shanghai Jiao Tong University, China); Zhetao Li (Xiangtan University, China); Laurence T. Yang (St. Francis Xavier University, Canada); Jiadi Yu and Bin Yao (Shanghai Jiao Tong University, China)

It is critical but difficult to efficiently model resources in space-terrestrial integrated networks (STINs). Existing work is not applicable to STINs because it lacks joint consideration of the different movement patterns and fluctuating loads.

In this paper, we propose the time-varying resource graph (TVRG) to model STINs from the resource perspective. First, we propose the STIN mobility model to uniformly model the different movement patterns in STINs. Then, we propose a layered Resource Modeling and Abstraction (RMA) approach, in which the evolution of node resources is modeled as a Markov process by encoding predictable topologies and the influence of fluctuating loads as states. In addition, we propose a low-complexity domain resource abstraction algorithm by defining two mobility-based and load-aware partial orders on resource abilities. Finally, we propose an efficient TVRG-based Resource Scheduling (TRS) algorithm for time-sensitive and bandwidth-intensive data flows, with multi-level on-demand scheduling capability. Comprehensive simulation results demonstrate that RMA-TRS outperforms related schemes in terms of throughput, end-to-end delay and flow completion time.

Redundant Entanglement Provisioning and Selection for Throughput Maximization in Quantum Networks

Yangming Zhao and Chunming Qiao (University at Buffalo, USA)

Quantum communication using qubits based on the principle of entangled photons is a promising way to improve network security. However, it is difficult to successfully create an entanglement link or connection between two nodes, especially when they are far apart. In addition, only one qubit can be exchanged over an established entanglement connection, resulting in low throughput.

In this paper, we propose Redundant Entanglement Provisioning and Selection (REPS) to maximize the throughput for multiple source-destination (SD) pairs in a circuit-switched, multi-hop quantum network. REPS has two distinct features: (i) it provisions backup resources for extra entanglement links between adjacent nodes for failure tolerance; and (ii) it provides flexibility in selecting successfully created entanglement links to establish entanglement connections for the SD pairs, achieving network-wide optimization. Extensive analysis and simulations show that REPS achieves optimal routing with high probability and improves throughput by up to 68.35% over the best existing algorithms. It also improves fairness among the SD pairs in the network.
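
The sketch below (toy probabilities, not the REPS optimization itself) illustrates why provisioning redundant entanglement attempts and then selecting among the successes helps: with per-attempt success probability p, provisioning k attempts per hop raises the chance that a two-hop connection can be stitched together.

```python
# Toy estimate of the probability that a two-hop entanglement connection can be
# formed when each hop provisions k redundant entanglement attempts, each
# succeeding independently with probability p. Entanglement swapping at the
# middle node is assumed to always succeed here for simplicity.

def hop_success(p, k):
    """At least one of the k attempts on a hop succeeds."""
    return 1 - (1 - p) ** k

def two_hop_connection(p, k):
    """Both hops need at least one usable entanglement link."""
    return hop_success(p, k) ** 2

p = 0.3
for k in (1, 2, 3):
    print(k, round(two_hop_connection(p, k), 3))
# 1 0.09
# 2 0.26
# 3 0.432
```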

Session Chair

Ana Aguiar (University of Porto, Portugal)

Session C-9

Social Networks and Applications

Conference
2:30 PM — 4:00 PM EDT
Local
May 13 Thu, 11:30 AM — 1:00 PM PDT

Medley: Predicting Social Trust in Time-Varying Online Social Networks

Wanyu Lin and Baochun Li (University of Toronto, Canada)

Social media such as Reddit have become the norm in our daily lives, where users routinely express their attitudes using upvotes (likes) or downvotes. These social interactions may encourage users to interact frequently and form strong ties of trust with one another. It is therefore important to predict social trust from these interactions, as such predictions facilitate routine features of social media, such as online recommendation and advertising.

Conventional methods for predicting social trust often accept static graphs as input, oblivious to the fact that social interactions are time-dependent. In this work, we propose Medley to explicitly model users' time-varying latent factors and to predict social trust that varies over time. We use functional time encoding to capture continuous-time features and employ attention mechanisms to assign higher importance weights to more recent social interactions. By incorporating topological structures that evolve over time, our framework can infer pairwise social trust based on past interactions. Our experiments on benchmark datasets show that Medley utilizes time-varying interactions effectively for predicting social trust and achieves an accuracy up to 26% higher than its alternatives.
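
A small sketch of sinusoidal, functional-style time encoding for continuous timestamps (illustrative only; Medley's encoding and attention weights are learned, and the frequencies below are an assumed fixed schedule):

```python
import math

# Map a continuous timestamp to a fixed-length vector so a model can reason
# about "how long ago" an interaction happened.

def time_encoding(t, dim=8, base=10000.0):
    """Sinusoidal encoding of a scalar time value into a dim-dimensional vector."""
    enc = []
    for i in range(dim // 2):
        freq = 1.0 / (base ** (2 * i / dim))
        enc.append(math.sin(freq * t))
        enc.append(math.cos(freq * t))
    return enc

# Recent interactions map to encodings close to the current time's encoding,
# old ones drift away; attention can use this to down-weight stale interactions.
now = 1000.0
for t in (now, now - 1.0, now - 500.0):
    print([round(v, 3) for v in time_encoding(t, dim=4)])
```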

Setting the Record Straighter on Shadow Banning

Erwan Le Merrer (Inria, France); Benoit Morgan (IRIT-ENSEEIHT, University of Toulouse, France); Gilles Tredan (LAAS-CNRS, France)

Shadow banning occurs when an online social network limits the visibility of some of its users without their being aware of it. Twitter declares that it does not use such a practice, sometimes blaming "bugs" for restrictions placed on some users. This paper is the first to address the plausibility of shadow banning on a major online platform, adopting both a statistical and a graph topological approach. We first conduct an extensive data collection and analysis campaign, gathering occurrences of visibility limitations on user profiles (we crawl more than 2.5 million of them). In this black-box observation setup, we highlight the salient user profile features that may explain a banning practice (using machine learning predictors). We then pose two hypotheses for the phenomenon: i) limitations are bugs, as claimed by Twitter, and ii) shadow banning propagates as an epidemic on user-interaction ego-graphs. We show that hypothesis i) is statistically unlikely with regard to the data we collected. We then show interesting correlations with hypothesis ii), suggesting that the interaction topology is a good indicator of the presence of groups of shadow-banned users on the service.

MIERank: Co-ranking Individuals and Communities with Multiple Interactions in Evolving Networks

Shan Qu (Shanghai Jiaotong University, China); Luoyi Fu (Shanghai Jiao Tong University, China); Xinbing Wang (Shanghai Jiaotong University, China)

Ranking has significant applications in real life. It aims to evaluate the importance (or popularity) of two categories of objects: individuals and communities. Numerous efforts have been devoted to these two types of ranking separately. In this paper, we instead explore, for the first time, the co-ranking of both individuals and communities. Our insight is that co-ranking may enhance the mutual evaluation of both sides. To this end, we first establish an Evolving Coupled Graph that contains a series of smoothly weighted snapshots, each of which characterizes and couples the intricate interactions of both individuals and communities up to a certain evolution time into a single graph. We then propose an algorithm, called MIERank, that co-ranks individuals and communities in the proposed evolving graph. The core of MIERank is a novel unbiased random walk which, when sampling the interplay among nodes over different generation times, incorporates preference knowledge of ranking by utilizing nodes' future actions. MIERank returns the co-ranking of both individuals and communities by iteratively alternating between their corresponding stationary probabilities of the unbiased random walk in a mutually reinforcing manner. We prove the efficiency of MIERank in terms of its convergence, optimality and extensibility. Our experiments on a large scholarly dataset of 606,862 papers and 1,215 fields further validate the superiority of MIERank, with fast convergence and up to a 26% ranking accuracy gain compared with separate ranking counterparts.
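
A bare-bones sketch of alternating, mutually reinforcing ranking on a coupled graph (uniform toy weights; MIERank's walk is additionally time-aware and preference-aware, so this is only the flavor of the iteration):

```python
# Individuals and communities coupled by membership weights; scores reinforce
# each other: an individual gains rank from the communities it belongs to,
# and a community gains rank from its members.
membership = {                     # membership[individual] = {community: weight}
    "i0": {"c0": 1.0},
    "i1": {"c0": 0.7, "c1": 0.3},
    "i2": {"c1": 1.0},
    "i3": {"c0": 1.0},
}

def normalize(vec):
    s = sum(vec.values())
    return {k: v / s for k, v in vec.items()}

ind = {i: 1 / len(membership) for i in membership}   # individual scores
com = {"c0": 0.5, "c1": 0.5}                         # community scores

for _ in range(50):   # alternate until the scores (approximately) stabilize
    ind = normalize({i: sum(w * com[c] for c, w in ms.items())
                     for i, ms in membership.items()})
    com = normalize({c: sum(ms.get(c, 0.0) * ind[i]
                            for i, ms in membership.items()) for c in com})

print({k: round(v, 3) for k, v in ind.items()})
print({k: round(v, 3) for k, v in com.items()})
```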

ProHiCo: A Probabilistic Framework to Hide Communities in Large Networks

Xuecheng Liu and Luoyi Fu (Shanghai Jiao Tong University, China); Xinbing Wang (Shanghai Jiaotong University, China); John Hopcroft (Cornell University, USA)

While community detection has been one of the cornerstones of network analysis and data science, its opposite, community obfuscation, has received little attention in recent years. With increasing awareness of data security and privacy protection, there is a growing need to understand the impact of such attacks on traditional community detection algorithms. To this end, we investigate the community obfuscation problem, which aims to hide a target set of communities from being detected by perturbing the network structure. We identify and analyze the Matthew effect incurred by classical quality-function-based methods, which essentially results in an imbalanced allocation of perturbation resources. To mitigate this effect, we propose a probabilistic framework named ProHiCo to hide communities. The key idea of ProHiCo is to first allocate the perturbation budget randomly and fairly, and then choose the appropriate edges to perturb via likelihood minimization. The ProHiCo framework provides the additional freedom to choose the generative graph model with community structure. By incorporating the stochastic block model and its degree-corrected variant into the ProHiCo framework, we develop two scalable and effective algorithms called SBM and DCSBM. Via extensive experiments on 8 real-world networks and 5 community detection algorithms, we show that both SBM and DCSBM are about 30x faster than the prominent baselines in the literature when there are around 500 target communities, while achieving performance comparable to the baselines.

Session Chair

Fabricio Murai (Universidade Federal de Minas Gerais, Brazil)

Session C-10

Memory Topics

Conference
4:30 PM — 6:00 PM EDT
Local
May 13 Thu, 1:30 PM — 3:00 PM PDT

Adaptive Batch Update in TCAM: How Collective Optimization Beats Individual Ones

Ying Wan (Tsinghua University, China); Haoyu Song (Futurewei Technologies, USA); Yang Xu (Fudan University, China); Chuwen Zhang (Tsinghua University, China); Yi Wang (Southern University of Science and Technology, China); Bin Liu (Tsinghua University, China)

Rule update in TCAM has long been identified as a key technical challenge due to the rule order constraint. Existing algorithms treat each rule update as an independent task. However, emerging applications produce batched rule update requests, and processing the updates individually causes a high aggregate cost that can strain the processor and/or incur excessive interruptions of TCAM lookups. This paper presents the first true batch update algorithm, ABUT. Unlike other purported batch update algorithms, ABUT collectively evaluates and optimizes the TCAM placement for each whole batch. By applying topology grouping and maintaining group order invariance in TCAM, ABUT achieves a substantial reduction in computing time while still yielding best-in-class placement cost. Our evaluations show that ABUT is ideal for low-latency, high-throughput batch TCAM updates in modern high-performance switches.

HAVS: Hardware-accelerated Shared-memory-based VPP Network Stack

Shujun Zhuang and Jian Zhao (Shanghai Jiao Tong University, China); Jian Li (Shanghai Jiao Tong University, China); Ping Yu and Yuwei Zhang (Intel, China); Haibing Guan (Shanghai Jiao Tong University, China)

The number of requests to transfer large files is increasing rapidly in web server and remote-storage scenarios, and this increase requires higher processing capacity from the network stack. However, to fully decouple from applications, many recent userspace network stacks, such as VPP (vector packet processing) and Snap, adopt a shared-memory-based solution to communicate with upper-layer applications. During this communication, the application or the network stack needs to copy data to or from shared memory queues. In our verification experiment, these copy operations incur more than 50% CPU consumption and severe performance degradation when the transferred file is larger than 32 KB.

This paper adopts a hardware-accelerated solution and proposes HAVS, which integrates Intel I/O Acceleration Technology into the VPP network stack to achieve high-performance memory copy offloading. An asynchronous copy architecture is introduced in HAVS to free up CPU resources. Moreover, an abstract memcpy accelerator layer is constructed in HAVS to ease the use of different types of hardware accelerators and to sustain high availability through a fault-tolerance mechanism. A comprehensive evaluation shows that HAVS provides an average 50%-60% throughput improvement over the original VPP stack when accelerating the nginx and SPDK iSCSI target applications.
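
The sketch below mimics the shape of an abstract asynchronous-copy layer in Python (purely illustrative; HAVS offloads copies to Intel I/OAT DMA engines inside VPP, whereas here a thread pool stands in for the hardware and the class and method names are invented for the example):

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative shape of an abstract async-copy layer: callers submit copy jobs
# and poll for completion instead of blocking on memcpy. A real backend would
# queue descriptors to the accelerator and fall back to software memcpy on failure.

class AsyncCopyEngine:
    def __init__(self, workers=2):
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def submit(self, dst: bytearray, src: bytes):
        """Start an asynchronous copy of src into dst and return a handle."""
        return self._pool.submit(self._copy, dst, src)

    @staticmethod
    def _copy(dst, src):
        dst[:len(src)] = src          # the stand-in "device" doing the copy
        return len(src)

engine = AsyncCopyEngine()
buf = bytearray(1 << 20)
handle = engine.submit(buf, b"x" * 4096)
# ... the caller keeps processing other work here ...
print("copied bytes:", handle.result())   # poll/wait for completion
```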

Maximizing the Benefit of RDMA at End Hosts

Xiaoliang Wang (Nanjing University, China); Hexiang Song (NJU, China); Cam-Tu Nguyen (Nanjing University, Vietnam); Dongxu Cheng and Tiancheng Jin (NJU, China)

RDMA is increasingly deployed in data centers to meet demands for ultra-low latency, high throughput and low CPU overhead. However, it is not easy to migrate existing applications from the TCP/IP stack to RDMA: developers usually need to carefully select communication primitives and manually tune parameters for each single-purpose system. From operating a high-speed RDMA network, we identify multiple hidden costs that may cause degraded and/or unpredictable performance of RDMA-based applications. These hidden costs include complicated parameter settings, the scalability of Reliable Connections, two-sided memory management and page alignment, and resource contention among diverse traffic classes. To address these problems, we introduce Nem, a suite that allows developers to maximize the benefit of RDMA by i) eliminating resource contention in the NIC cache through asynchronous resource sharing; ii) introducing hybrid page management based on message sizes; and iii) isolating flows of different traffic classes based on hardware features. We implement a prototype of Nem and verify its effectiveness by rebuilding an RPC message service, which achieves high throughput for large messages and low latency for small messages without compromising low CPU utilization, and scales well to a large number of active connections.
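
One of the hidden costs above, page management, can be sketched as a simple size-based policy (hypothetical thresholds and pool names, not Nem's actual scheme): small messages go inline, mid-sized messages use pre-registered regular-page buffers, and large messages use hugepage-backed regions to reduce registration and pinning overhead.

```python
# Toy buffer-class policy keyed on message size (thresholds are made up).
INLINE_LIMIT = 256                 # bytes that fit inside the work request
HUGEPAGE_THRESHOLD = 64 * 1024     # switch to hugepage-backed buffers here

def pick_buffer(msg_size: int) -> str:
    if msg_size <= INLINE_LIMIT:
        return "inline"                # no separate registered buffer needed
    if msg_size < HUGEPAGE_THRESHOLD:
        return "regular-page pool"     # pre-registered 4 KB page buffers
    return "hugepage pool"             # fewer pages to register and pin

for size in (128, 8 * 1024, 1 << 20):
    print(size, "->", pick_buffer(size))
# 128 -> inline
# 8192 -> regular-page pool
# 1048576 -> hugepage pool
```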

Session Chair

Xinwen Fu (University of Massachusetts Lowell)
